Last Layer Logits to Logic: Empowering LLMs with Logic-Consistent Structured Knowledge Reasoning
Li, Songze, Liu, Zhiqiang, Gong, Zhaoyan, Guo, Xiaoke, Gui, Zhengke, Chen, Huajun, Zhang, Wen
Large Language Models (LLMs) achieve excellent performance in natural language reasoning tasks through pre-training on vast unstructured text, enabling them to understand the logic in natural language and generate logic-consistent responses. However, the representational differences between unstructured and structured knowledge make LLMs inherently struggle to maintain logic consistency, leading to \textit{Logic Drift} challenges in structured knowledge reasoning tasks such as Knowledge Graph Question Answering (KGQA). Existing methods address this limitation by designing complex workflows embedded in prompts to guide LLM reasoning. Nevertheless, these approaches only provide input-level guidance and fail to fundamentally address the \textit{Logic Drift} in LLM outputs. Additionally, their inflexible reasoning workflows cannot adapt to different tasks and knowledge graphs. To enhance LLMs' logic consistency in structured knowledge reasoning, we specifically target the logits output from the autoregressive generation process. We propose the \textit{Logits-to-Logic} framework, which incorporates logits strengthening and logits filtering as core modules to correct logical defects in LLM outputs. Extensive experiments show that our approach significantly improves LLMs' logic consistency in structured knowledge reasoning and achieves state-of-the-art performance on multiple KGQA benchmarks.
- Europe > Austria > Vienna (0.14)
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (12 more...)
- Workflow (1.00)
- Research Report > New Finding (0.46)
- Leisure & Entertainment (1.00)
- Media > Music (0.93)
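The two core modules named in the abstract above, logits filtering and logits strengthening, can be sketched as simple operations on the last-layer logit vector before softmax: mask tokens that would break logical consistency with the knowledge graph, and boost tokens the structured knowledge supports. This is only a structural sketch under assumed conventions (token ids as list indices, illustrative function names), not the paper's implementation.

```python
import math

NEG_INF = float("-inf")

def filter_logits(logits, allowed):
    """Logits filtering (sketch): set logits of token ids outside the
    allowed set to -inf so they get zero probability after softmax."""
    return [v if i in allowed else NEG_INF for i, v in enumerate(logits)]

def strengthen_logits(logits, preferred, bonus=2.0):
    """Logits strengthening (sketch): add a constant bonus to token ids
    supported by the structured knowledge; the real framework would
    derive these sets from the KG, the bonus here is arbitrary."""
    return [v + bonus if i in preferred else v for i, v in enumerate(logits)]

def softmax(logits):
    """Numerically stable softmax that treats -inf as hard-masked."""
    m = max(v for v in logits if v != NEG_INF)
    exps = [math.exp(v - m) if v != NEG_INF else 0.0 for v in logits]
    s = sum(exps)
    return [e / s for e in exps]
```

Masked tokens receive exactly zero probability, so the autoregressive decoder can never emit a logic-inconsistent continuation, which is the output-level guarantee that prompt-level guidance cannot give.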
ThoughtProbe: Classifier-Guided LLM Thought Space Exploration via Probing Representations
This paper introduces ThoughtProbe, a novel inference-time framework that leverages the hidden reasoning features of Large Language Models (LLMs) to improve their reasoning performance. Unlike previous works that manipulate hidden representations to steer LLM generation, we harness them as discriminative signals to guide exploration of the tree-structured response space. In each node expansion, a classifier serves as a scoring and ranking mechanism that efficiently allocates computational resources by prioritizing higher-scoring candidates for continuation. After completing the tree expansion, we collect answers from all branches to form a candidate answer pool. We then propose a branch aggregation method that marginalizes over all supporting branches by aggregating their CoT scores, thereby identifying the optimal answer from the pool. Experimental results show that our framework's comprehensive exploration not only covers valid reasoning chains but also effectively identifies them, achieving significant improvements across multiple arithmetic reasoning benchmarks.
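The branch aggregation step described in this abstract can be sketched as marginalizing over branches that support the same answer: sum the CoT scores of all branches ending in each candidate answer and return the answer with the highest total. The data shapes below are assumptions for illustration, not the paper's interface.

```python
from collections import defaultdict

def aggregate_branches(branches):
    """Branch aggregation (sketch): each branch is an (answer, cot_score)
    pair; marginalize by summing scores per answer, then pick the answer
    with the highest aggregate score from the candidate pool."""
    totals = defaultdict(float)
    for answer, score in branches:
        totals[answer] += score
    return max(totals, key=totals.get)
```

Note that an answer supported by several moderately scored branches can beat a single high-scoring branch, which is the point of marginalizing rather than taking the argmax branch.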
Supplementary Material: Relaxing Local Robustness
That is, Jia et al. provide a probabilistic guarantee that Equation A1 holds. In their evaluation, Jia et al. consider a point. We therefore stipulate that certification must be independent of the true label of the point being certified. While Jia et al. do not address this issue, one straightforward adaptation of their approach is to take Nets, which naturally satisfy affinity robustness on all non-rejected points. By the definition of y, we obtain (C2). Then, by applying (C6), we obtain (C9).
- Transportation (0.46)
- Aerospace & Defense > Aircraft (0.46)
ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning
Pre-trained large language models (LLMs) have been demonstrated to possess intrinsic reasoning capabilities that can emerge naturally when expanding the response space. However, the neural representation mechanisms underlying these intrinsic capabilities, and approaches for their optimal utilization, remain inadequately understood. In this work, we make the key discovery that a simple linear classifier can effectively detect intrinsic reasoning capabilities in LLMs' activation space, particularly within specific representation types and network layers. Based on this finding, we propose a classifier-guided search framework that strategically explores a tree-structured response space. In each node expansion, the classifier serves as a scoring and ranking mechanism that efficiently allocates computational resources by identifying and prioritizing more thoughtful reasoning directions for continuation. After completing the tree expansion, we collect answers from all branches to form a candidate answer pool. We propose a branch-aggregation selection method that marginalizes over all supporting branches by aggregating their thoughtfulness scores, thereby identifying the optimal answer from the pool. Experimental results show that our framework's comprehensive exploration not only covers valid reasoning chains but also effectively identifies them, achieving significant improvements across multiple arithmetic reasoning benchmarks.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Workflow (0.68)
- Research Report > New Finding (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
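The "simple linear classifier on activation space" in the abstract above can be sketched as a logistic probe over a hidden-state vector, used to rank candidate continuations at each node expansion. The weights, data shapes, and function names are illustrative assumptions; a real probe would be fit on labeled activations from specific layers.

```python
import math

def probe_score(weights, bias, activation):
    """Linear probe (sketch): a logistic score over a hidden-state
    vector, used as a proxy for how 'thoughtful' a partial reasoning
    chain is. Weights here are placeholders, not trained values."""
    z = sum(w * a for w, a in zip(weights, activation)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def select_top_k(candidates, weights, bias, k):
    """Score each candidate continuation's activation with the probe and
    keep the k best, so compute goes to the most promising branches."""
    ranked = sorted(candidates,
                    key=lambda c: probe_score(weights, bias, c["activation"]),
                    reverse=True)
    return ranked[:k]
```

Pruning to the top k at every expansion is what turns an exponential response tree into a budgeted search.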
Reducing the False Positive Rate Using Bayesian Inference in Autonomous Driving Perception
Melotti, Gledson, Bastos, Johann J. S., da Silva, Bruno L. S., Zanotelli, Tiago, Premebida, Cristiano
Object recognition is a crucial step in perception systems for autonomous and intelligent vehicles, as evidenced by the numerous research works on the topic. In this paper, object recognition is explored using multisensory and multimodality approaches, with the intention of reducing the false positive rate (FPR). Reducing the FPR is increasingly important in perception systems, since misclassifying an object can potentially cause accidents. In particular, this work presents a Bayesian inference strategy to reduce the FPR, modeling the likelihood function as a cumulative distribution function from Gaussian kernel density estimations, and the prior probabilities as cumulative functions of normalized histograms. The proposed methodology is validated on the KITTI dataset using deep networks (DenseNet, NasNet, and EfficientNet) and recent 3D point cloud networks (PointNet and PointNet++), considering three object categories (cars, cyclists, pedestrians) and the RGB and LiDAR sensor modalities.
- South America > Brazil > Espírito Santo (0.04)
- North America > United States > New York (0.04)
- Europe > Switzerland (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Transportation > Ground > Road (0.65)
- Information Technology > Robotics & Automation (0.51)
- Information Technology > Security & Privacy (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
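The Bayesian step described in this abstract, a likelihood built from the CDF of a Gaussian kernel density estimate combined with class priors, can be sketched roughly as follows. The bandwidth, sample scores, and rejection logic are toy assumptions for illustration; the paper fits these quantities from per-class score distributions.

```python
import math

def gaussian_kde_cdf(x, samples, bandwidth):
    """CDF of a Gaussian kernel density estimate: the mean of a normal
    CDF centred at each sample score, with the given bandwidth."""
    phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return sum(phi((x - s) / bandwidth) for s in samples) / len(samples)

def posterior(score, class_scores, priors, bandwidth=0.05):
    """Bayes' rule with KDE-CDF likelihoods per class (sketch):
    detections whose posterior for the predicted class is low can then
    be rejected, trading a little recall for a lower false positive
    rate. Bandwidth and inputs here are illustrative."""
    likes = {c: gaussian_kde_cdf(score, s, bandwidth)
             for c, s in class_scores.items()}
    evidence = sum(likes[c] * priors[c] for c in likes)
    return {c: likes[c] * priors[c] / evidence for c in likes}
```

The KDE-CDF is monotone in the score and the posteriors are properly normalized, which is what makes a fixed rejection threshold meaningful across classes.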
SNAP: Efficient Extraction of Private Properties with Poisoning
Chaudhari, Harsh, Abascal, John, Oprea, Alina, Jagielski, Matthew, Tramèr, Florian, Ullman, Jonathan
Property inference attacks allow an adversary to extract global properties of the training dataset from a machine learning model. Such attacks have privacy implications for data owners who share their datasets to train machine learning models. Several existing approaches for property inference attacks against deep neural networks have been proposed, but they all rely on the attacker training a large number of shadow models, which incurs a large computational overhead. In this paper, we consider the setting of property inference attacks in which the attacker can poison a subset of the training dataset and query the trained target model. Motivated by our theoretical analysis of model confidences under poisoning, we design an efficient property inference attack, SNAP, which obtains higher attack success and requires less poisoning than the state-of-the-art poisoning-based property inference attack by Mahloujifar et al. For example, on the Census dataset, SNAP achieves a 34% higher success rate than Mahloujifar et al. while being 56.5x faster. We also extend our attack to infer whether a certain property was present at all during training and to efficiently estimate the exact proportion of a property of interest. We evaluate our attack on several properties of varying proportions from four datasets and demonstrate SNAP's generality and effectiveness. An open-source implementation of SNAP can be found at https://github.com/johnmath/snap-sp23.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Germany (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
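The distinguishing test at the heart of a confidence-based attack like the one this abstract describes can be sketched as a threshold on query confidences: poisoning shifts the target model's confidence on attack queries, and the attacker compares the observed mean to a threshold calibrated from two simulated worlds. The midpoint rule and all names below are deliberately simple illustrations, not SNAP's actual derivation, which comes from a theoretical analysis of confidences under poisoning.

```python
from statistics import mean

def fit_threshold(with_prop, without_prop):
    """Calibrate a decision threshold (sketch): midpoint between the
    mean confidences the attacker observes locally in a world where the
    property holds and one where it does not."""
    return (mean(with_prop) + mean(without_prop)) / 2.0

def infer_property(query_confidences, threshold):
    """Distinguishing test (sketch): after poisoning, query the trained
    target model and guess that the property is present when the mean
    confidence on attack queries exceeds the calibrated threshold."""
    return mean(query_confidences) > threshold
```

Because the test needs only one trained target model plus local calibration, it avoids the many shadow models that make earlier property inference attacks expensive.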
Baselines for Identifying Watermarked Large Language Models
Tang, Leonard, Uberti, Gavin, Shlomi, Tom
We consider the emerging problem of identifying the presence and use of watermarking schemes in widely used, publicly hosted, closed-source large language models (LLMs). We introduce a suite of baseline algorithms for identifying watermarks in LLMs that rely on analyzing distributions of output tokens and logits generated by watermarked and unmarked LLMs. Notably, watermarked LLMs tend to produce distributions that diverge qualitatively and identifiably from those of unmarked LLMs.

Generated Text Detection via Statistical Discrepancies. Recent methods such as DetectGPT and GPTZero distinguish between machine-generated and human-written text by analyzing their statistical discrepancies (Tian, 2023; Mitchell et al., 2023). DetectGPT compares the log probability computed by a model on unperturbed text and perturbed variations, leveraging the observation that text sampled from an LLM generally occupies negative curvature regions of the model's log probability function. GPTZero instead uses perplexity and burstiness to distinguish human from machine text, with lower perplexity and burstiness indicating a greater likelihood of machine-generated text.
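One minimal statistic of the kind such watermark-identification baselines analyze is the divergence between the output token distributions of two models: a large divergence on matched prompts is a signal that one model's sampling has been biased, e.g. by a watermark. The smoothing scheme and names below are illustrative assumptions, not an algorithm from the paper.

```python
import math
from collections import Counter

def token_distribution(tokens, vocab):
    """Smoothed relative frequency of each vocabulary item in a sample
    of model output (add-one smoothing keeps every probability > 0)."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {t: (counts[t] + 1) / total for t in vocab}

def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary; a large value flags that one
    model's output token distribution diverges identifiably from the
    other's, the kind of simple baseline signal discussed above."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p)
```

In practice a baseline would aggregate this over many prompts and compare against the variation seen between two unmarked models, since sampling noise alone produces nonzero divergence.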